Visualizing and Monitoring Your AWS Container Fleet with Datadog: A Guide by Chanci Turner


In any fast-paced environment, whether it’s an online store on peak shopping days or a game development team gearing up for a major launch, containers offer the crucial ability to swiftly and automatically scale systems in response to changing demands. By encapsulating microservices within containers, developers can separate applications from their underlying infrastructure, thus facilitating smoother code deployments without unexpected disruptions.

However, the dynamic nature of containers introduces new monitoring challenges. Containers have an average lifespan of only two days, so many traditional observability tools, which are typically designed for servers that last several months, struggle to keep up. To harness the flexibility and scalability of containers, you need a robust monitoring solution that can provide insight into a highly volatile environment of thousands, or even tens of thousands, of transient containers.

In this article, I will show how Datadog delivers extensive, real-time visibility into rapidly changing container workloads on Amazon Elastic Kubernetes Service (Amazon EKS), a managed service that lets you run Kubernetes on Amazon Web Services (AWS) without operating your own Kubernetes control plane.

Datadog stands as an AWS Partner Network (APN) Advanced Technology Partner, recognized for its expertise in AWS with the AWS Container Competency and multiple service delivery designations.

An Overview of Datadog for Amazon EKS

Datadog offers profound insights into applications running on Amazon EKS, along with the foundational cloud and container infrastructure. With Datadog, you can categorize your container fleets using Kubernetes labels and AWS tags, delve into data from specific containers and services, and automate your monitoring and scaling processes.

As a managed Kubernetes service, Amazon EKS allows you to entrust the complexities of running your Kubernetes clusters to AWS. This means AWS will handle the provisioning, configuration, and hosting of your Kubernetes clusters on its infrastructure, simplifying setup while enabling access to diverse AWS technologies, such as Amazon Elastic Compute Cloud (Amazon EC2) for host provisioning, Elastic Load Balancing (ELB), and Amazon Elastic Block Store (Amazon EBS) for high-performance block storage.

Amazon EKS automatically manages the Kubernetes control plane, distributing control plane nodes across multiple AWS Availability Zones for high availability. By automating these operational tasks, Amazon EKS lets development teams concentrate on building applications and improving the user experience. Datadog integrates with a wide array of AWS technologies, from load balancers to databases, so you can visualize, monitor, and improve every part of your Amazon EKS environment.

Installing the Datadog Agent

Traditionally, obtaining detailed insights about a running container required Secure Shell (SSH) access to a specific host, followed by executing shell commands to check resource usage or review container logs. This approach does not scale well across thousands of containers that are continually created, destroyed, or moved.

Datadog consolidates all operational data from your cluster in a single interface, enabling exploration of every layer of your stack. The lightweight, open-source Datadog Agent operates on your nodes, gathering metrics and observability data, and relaying this information to Datadog to facilitate easy searches, filtering, aggregation, and alerts on key insights.

To deploy the Datadog Agent across your Amazon EKS cluster and view real-time performance data in fine detail, follow the documentation to create a Kubernetes DaemonSet, which automatically runs the Agent on each node. Alternatively, if DaemonSets cannot be used on your clusters, you can install the Datadog Agent as a Deployment on each Kubernetes node.
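
To make the shape of such a DaemonSet concrete, here is a minimal sketch that builds a manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. It is not the official Datadog manifest: the image tag, the Secret name `datadog-secret`, its `api-key` key, and the function name `agent_daemonset` are assumptions made for illustration, and a production manifest from the documentation also includes RBAC, volume mounts, and resource limits.

```python
import json


def agent_daemonset(namespace="default",
                    image="gcr.io/datadoghq/agent:7",   # assumed image tag
                    secret_name="datadog-secret"):       # hypothetical Secret
    """Build a minimal DaemonSet manifest that schedules one Agent pod per node."""
    labels = {"app": "datadog-agent"}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "datadog-agent", "namespace": namespace},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "agent",
                        "image": image,
                        "env": [
                            # API key read from a Secret (name and key assumed).
                            {"name": "DD_API_KEY",
                             "valueFrom": {"secretKeyRef": {
                                 "name": secret_name, "key": "api-key"}}},
                            # Point the Agent at the kubelet on its own node.
                            {"name": "DD_KUBERNETES_KUBELET_HOST",
                             "valueFrom": {"fieldRef": {
                                 "fieldPath": "status.hostIP"}}},
                        ],
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(agent_daemonset(), indent=2))
```

Because a DaemonSet schedules one pod per node, nodes that join the cluster later receive an Agent without any further action.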

Activating Datadog’s Autodiscovery Feature

To automate monitoring in dynamic settings, the Datadog Agent features the Autodiscovery capability, which identifies the workloads running on various nodes and adjusts data collection accordingly. Autodiscovery checks a container’s identifiers against pre-existing integration templates, with Kubernetes and Amazon EKS identifiers stored in pod annotations.

The Datadog Agent systematically scans all pod annotations to extract configuration details for monitoring checks, including host variables and port numbers. By dynamically aligning your monitoring with your workloads, Autodiscovery enables tracking of containers as they launch, terminate, or relocate across pods and hosts.

Here’s an example manifest for a Redis pod, annotated for Autodiscovery. Whenever a new pod is generated from this manifest, the Datadog Agent activates its Redis integration (based on the redisdb annotation). The Agent then dynamically applies the host IP address, the default port for Redis (6379), and the Redis password to the integration template, allowing monitoring data to be collected from the Redis container, regardless of its location.

apiVersion: v1
kind: Pod
metadata:
  name: redis
  annotations:
    ad.datadoghq.com/redis.check_names: '["redisdb"]'
    ad.datadoghq.com/redis.init_configs: '[{}]'
    ad.datadoghq.com/redis.instances: |
      [
        {
          "host": "%%host%%",
          "port": "6379",
          "password": "%%env_REDIS_PASSWORD%%"
        }
      ]
    ad.datadoghq.com/redis.logs: '[{"source":"redis","service":"redis"}]'
  labels:
    name: redis
spec:
  containers:
    - name: redis
      image: redis:latest
      ports:
        - containerPort: 6379 

For more comprehensive details (and integration templates for additional technologies), refer to the documentation.
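
The template substitution described above can be illustrated with a small toy model. This is not the Agent's actual implementation, only a sketch of the idea: the function names `resolve` and `resolve_instances`, the sample address `10.0.1.23`, and the sample password are all hypothetical. It parses the `instances` annotation as JSON and replaces `%%host%%` with the address discovered for the container and `%%env_NAME%%` with a value from the Agent's environment.

```python
import json
import re

# Matches template variables such as %%host%% or %%env_REDIS_PASSWORD%%.
TEMPLATE_VAR = re.compile(r"%%(\w+)%%")


def resolve(value, host, agent_env):
    """Replace template variables in a single string value."""
    def substitute(match):
        name = match.group(1)
        if name == "host":
            return host
        if name.startswith("env_"):
            return agent_env[name[len("env_"):]]
        return match.group(0)  # leave unknown variables untouched
    return TEMPLATE_VAR.sub(substitute, value)


def resolve_instances(annotations, container, host, agent_env):
    """Parse the instances annotation for a container and fill in its variables."""
    raw = annotations[f"ad.datadoghq.com/{container}.instances"]
    resolved = []
    for instance in json.loads(raw):
        resolved.append({
            key: resolve(val, host, agent_env) if isinstance(val, str) else val
            for key, val in instance.items()
        })
    return resolved


if __name__ == "__main__":
    annotations = {
        "ad.datadoghq.com/redis.instances":
            '[{"host": "%%host%%", "port": "6379", '
            '"password": "%%env_REDIS_PASSWORD%%"}]',
    }
    config = resolve_instances(
        annotations, "redis",
        host="10.0.1.23",                       # hypothetical discovered address
        agent_env={"REDIS_PASSWORD": "s3cret"},  # hypothetical Agent environment
    )
    print(config)
```

Since the address is filled in at discovery time, the same annotation keeps working when the pod is rescheduled onto another node.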

Installing the Cluster Agent for Efficient Data Collection

The Datadog Cluster Agent is specifically tailored to collect monitoring data from extensive Kubernetes clusters. Essentially, the Cluster Agent acts as a mediator between the Agents installed on your nodes and the Kubernetes API server.

This architecture alleviates the communication load on the API server while permitting Node Agents to enrich locally collected metrics with cluster-level data. Additionally, the Cluster Agent can automatically scale Amazon EKS workloads based on any metrics gathered by Datadog. In this role, the Cluster Agent functions as an external metrics provider, delivering custom metrics to Kubernetes’ Horizontal Pod Autoscaler (HPA), which will adjust scaling in accordance with your directives.
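
To see how the HPA turns a metric into a scaling decision, the sketch below follows the replica calculation given in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the current metric value to the target value, rounded up, with no change when the ratio falls within a tolerance of 1.0 (0.1 by default). The function name `desired_replicas` and the sample numbers are illustrative assumptions, and the real controller also applies minimum and maximum replica bounds and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Return the replica count the HPA formula would request.

    desired = ceil(current_replicas * current_value / target_value),
    unless the ratio is within `tolerance` of 1.0, in which case no scaling occurs.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: leave as is
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical example: 4 pods, an external metric at 200 against a target of 100.
    print(desired_replicas(4, current_value=200, target_value=100))
```

With an external metric served by the Cluster Agent, `current_value` would be the value Datadog reports for the metric query and `target_value` the target set in the HPA specification.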
